NMF-based speech enhancement incorporating deep neural network

نویسندگان

  • Tae Gyoon Kang
  • Kisoo Kwon
  • Jong Won Shin
  • Nam Soo Kim
چکیده

Recently, lots of algorithms using machine learning approaches have been proposed in the speech enhancement area. One of the most well-known approaches is the non-negative matrix factorization (NMF) -based one which analyzes noisy speech with speech and noise bases. However, NMF-based algorithms have difficulties in estimating speech and noise encoding vectors when their subspaces overlap. In this paper, we propose a novel speech enhancement algorithm which uses deep neural network (DNN) to improve the encoding vector estimation of the NMF-based technique. A DNN is trained to represent the mapping from noisy speech to corresponding encoding vectors. The quality of the enhanced speech from the proposed NMF-based scheme adopting DNN-based encoding vector estimation is compared with that from the conventional NMF-based technique. The experimental results showed that the proposed speech enhancement algorithm outperformed the conventional NMF-based speech enhancement technique.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Investigating NMF speech enhancement for neural network based acoustic models

In the light of the improvements that were made in the last years with neural network-based acoustic models, it is an interesting question whether these models are also suited for noise-robust recognition. This has not yet been fully explored, although first experiments confirm this question. Furthermore, preprocessing techniques that improve the robustness should be re-evaluated with these new...

متن کامل

Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization

This paper presents a statistical method of single-channel speech enhancement that uses a variational autoencoder (VAE) as a prior distribution on clean speech. A standard approach to speech enhancement is to train a deep neural network (DNN) to take noisy speech as input and output clean speech. Although this supervised approach requires a very large amount of pair data for training, it is not...

متن کامل

Deep Recurrent Neural Network Based Monaural Speech Separation Using Recurrent Temporal Restricted Boltzmann Machines

This paper presents a single-channel speech separation method implemented with a deep recurrent neural network (DRNN) using recurrent temporal restricted Boltzmann machines (RTRBM). Although deep neural network (DNN) based speech separation (denoising task) methods perform quite well compared to the conventional statistical model based speech enhancement techniques, in DNN-based methods, the te...

متن کامل

Combining Bottleneck-BLSTM and Semi-Supervised Sparse NMF for Recognition of Conversational Speech in Highly Instationary Noise

We address the speaker independent automatic recognition of spontaneous speech in highly variable noise by applying semisupervised sparse non-negative matrix factorization (NMF) for speech enhancement coupled with our recently proposed frontend utilizing bottleneck (BN) features generated by a bidirectional Long Short-Term Memory (BLSTM) recurrent neural network. In our evaluation, we unite the...

متن کامل

A DNN-HMM Approach to Non-Negative Matrix Factorization Based Speech Enhancement

General speaker-independent models have been used in nonnegative matrix factorization (NMF) based speech enhancement algorithms for the practical applicability. And additional regulation is necessary when choosing the optimal models for speech reconstruction. In this paper, we propose a novel utilization of deep neural network (DNN) to select the models used for separating speech from noise. Sp...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014